Abstract

The commented article attempts to prove a "General theorem" giving
sufficient conditions under which a previously introduced "general
conditioned average" "converges uniquely to the quantum weak value in the
minimal disturbance limit." The "general conditioned average" is
obtained from a positive operator valued measure (POVM) $\{E_j(g)\}_{j=1}^{N}$
depending on a small "weakness" parameter g. We point out that unstated
assumptions in the presentation of the "sufficient conditions" make them
appear much more general than they actually are. Indeed, the stated
"sufficient conditions" strengthened by these unstated assumptions seem very
close to an assumption that the POVM operators $E_j(g)$ be linear
polynomials (i.e., of first order in g). Moreover, there appears to be a critical
error or gap in the attempted proof, even assuming a linear POVM. A
counterexample to the proof of the "General theorem" (though not to its
conclusion) is given. Nevertheless, I conjecture that the conclusion is
actually true for linear POVM's whose contextual values are chosen by the
commented article's "pseudoinverse prescription".

1 Relation between traditional "weak measurement" theory and the "contextual value" approach of [1]

1.1 General introduction 

This is an expanded version of a paper submitted to J. Phys. A commenting on 
[1] (called DJ below). DJ attempts to refute counterexamples given in [6] to a 
"General theorem" (GT). The "Comment" paper discusses the validity of these 
counterexamples and gives a new counterexample to the proof (though not to 
the conclusion) of the GT. 

*For contact information, go to http://www.math.umb.edu/~sp
The submitted paper had to be written more tersely in order to keep its
length appropriate for a "Comment" paper and may be incomprehensible to 
anyone not already familiar with DJ. This expanded version adds the present 
Section 1 introduction together with an appendix which analyzes in detail some 
logical problems with the stated hypotheses of the GT. 

For orientation I first give a brief description of my view of the notion of
"contextual values", introduced in [2] (called DAJ below) and expounded by
Dressel and Jordan in more detail in [3] and the paper DJ under review [1].
My views are presented in more detail in [5] and [6]. The review assumes that
the reader has some familiarity with the ideas of weak measurement. A more 
leisurely exposition of these can be found in [7] and references cited there. To 
minimize confusion, I try to use the notation of DJ wherever practical, even 
though it is not the notation that I would choose. 

One minor exception is that I usually write $\langle u | A v \rangle$ instead of
$\langle u | A | v \rangle$ as in DJ's (1.1). The reason is that all the type was set
before noticing this small difference, and attempting to change it risks more
confusion than retaining it in case some instances which should be changed go
unnoticed.

1.2 Weak measurement

"Weak measurement", introduced in [4], is in part a technique for measuring the
expectation of a quantum observable A in a given state s without appreciably 
changing the state. This can be accomplished as follows. 

Suppose the observable A operates on a Hilbert space S. Couple S to an
auxiliary "meter space" M, obtaining a new Hilbert space $S \otimes M$ which is the
tensor product of S and M.

With each state s of S, associate a slightly entangled state $Us \in S \otimes M$,
where U is an isometry from S to $S \otimes M$. Find a "meter observable" B on
M such that the expectation of $I \otimes B$ in the state $Us$ is almost the same as
the expectation of A in the state s, where I generically denotes the identity
operator on whatever space is relevant in the context (in this case S).

To make this precise, introduce a small real "weak measurement" parameter
g with $U = U(g)$ depending on g. In terms of this parameter, "slightly

^ The reader should be warned that DAJ is vaguely written with many errors and omissions 
of important definitions and hypotheses. 

^[3] is written in an unusual, complicated notation different from both DAJ and DJ. It
contains what the authors characterize as a "slight generalization" of the "General theorem"
of the first arXiv version of DJ. The discussion of the GT in [3] gives no indication that its
validity is disputed in [6].

^ The reader should be warned that the six versions of [5] were written over a period of 
months as I tried to make sense of the vaguely written and error-ridden DAJ, and an evolution 
of its ideas from earlier to later versions will be apparent. The presentation may seem unusual 
in that introductions to the later versions were simply prepended to rewritten earlier versions. 
The presentation of [6] is more concise. 

^An isometry U is a linear transformation which preserves inner products:
$\langle Uv | Uw \rangle = \langle v | w \rangle$ for all v, w. The only difference between an
isometry and a unitary operator is that an isometry need not be surjective (i.e., "onto").
entangled" is interpreted as

$$\lim_{g \to 0} U(g)s = s \otimes m \tag{1}$$

(an unentangled product state, where m is some state of M).

Denote the projector onto the subspace spanned by a nonzero vector v as
$P_v$. A routine calculation shows that the preceding paragraph guarantees that
the (mixed) state of S corresponding to $U(g)s$, namely $\mathrm{tr}_M P_{U(g)s}$ where
$\mathrm{tr}_M$ denotes partial trace, approaches $P_s$ (i.e., the original pure state s
written in mixed state notation) as $g \to 0$. Similarly "the expectation of
$I \otimes B$ in the state $U(g)s$ is almost the same as the expectation of A in the
state s" is interpreted as holding exactly in the limit $g \to 0$:

$$\lim_{g \to 0} \langle U(g)s, (I \otimes B)U(g)s \rangle = \langle s | A s \rangle . \tag{2}$$

The mathematics of [1] can easily be made rigorous only under the assumption
that all Hilbert spaces occurring are finite dimensional, so that all
observables have discrete spectra. Let $\{f_j\}$ denote a collection of orthonormal
eigenvectors of B, with $\alpha_j$ the corresponding eigenvalues, which we assume
distinct for expositional simplicity. We shall allow the $\alpha_j = \alpha_j(g)$ to depend on
g, which implies that $B = B(g)$ also depends on g. In principle, one could
also allow the eigenvectors $f_j$ to depend on g, but for simplicity we assume that
they are constant. (This assumption can be justified by making appropriate
identifications.)

Write

$$U(g)s = \sum_j M_j(g)s \otimes f_j , \tag{3}$$

where this defines the "measurement" operators $M_j(g)$ on S. These measurement
operators define a positive operator valued measure (POVM) $\{E_j(g)\}$ on S
by $E_j(g) := M_j^\dagger(g) M_j(g)$. The probability $P(j)$ that a measurement of $I \otimes B$
in state $U(g)s$ will produce result j is the norm-squared of the $f_j$-component of
(3):

$$P(j) = |M_j(g)s|^2 = \langle s | E_j(g) s \rangle . \tag{4}$$

From (4), the expectation $\langle U(g)s, (I \otimes B)U(g)s \rangle$ of $I \otimes B$ in the state $U(g)s$
is

$$\langle U(g)s, (I \otimes B)U(g)s \rangle = \sum_j \alpha_j P(j) = \sum_j \alpha_j(g) \langle s | E_j(g) s \rangle . \tag{5}$$

Since the probabilities $P(j)$ depend only on data in S, by allowing the
eigenvalues $\alpha_j = \alpha_j(g)$ to depend on g we might hope to choose them to satisfy the
desired relation

$$\langle U(g)s, (I \otimes B(g))U(g)s \rangle = \sum_j \alpha_j(g) P(j) = \langle s | A s \rangle , \tag{6}$$



^The slightly subtle reason is discussed in [5]. 






which says that the expectation of the system observable A in the state s could
also be obtained by measuring the expectation of the meter observable $B(g)$ in
the state $U(g)s$. The advantage of measuring $B(g)$ instead of A is that for small
g, the (unnormalized) state of S after measurement result j is obtained, namely
$M_j(g)s$, is very close to s because of (1). Traditional weak measurement theory
shows how to choose $U(g)$ and $\alpha_j(g)$ to obtain (6) in the limit $g \to 0$, but it
does not give (6) as an exact equation for small but nonzero g.
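
A minimal numerical sketch may make the exact relation (6) concrete. It assumes a hypothetical two-outcome qubit POVM, $E_1(g) = (I + g\sigma_z)/2$ and $E_2(g) = (I - g\sigma_z)/2$ with $A = \sigma_z$, which is not taken from DAJ or DJ. For this POVM the values $\alpha_1 = 1/g$, $\alpha_2 = -1/g$ satisfy $\sum_j \alpha_j E_j = A$, so (6) holds for every nonzero g and not only in the limit. All function names are illustrative.

```python
import numpy as np

# Hypothetical example (not from DAJ or DJ): a two-outcome qubit POVM that is
# linear in g, E_1 = (I + g*sz)/2 and E_2 = (I - g*sz)/2, with A = sz.
I2 = np.eye(2)
sz = np.diag([1.0, -1.0])

def povm(g):
    return [(I2 + g * sz) / 2, (I2 - g * sz) / 2]

def contextual_values(g):
    # Chosen by inspection so that sum_j alpha_j(g) E_j(g) = sz exactly.
    return np.array([1.0 / g, -1.0 / g])

def meter_expectation(s, g):
    # sum_j alpha_j(g) <s|E_j(g) s>, i.e. sum_j alpha_j(g) P(j) in (6)
    probs = np.array([np.vdot(s, E @ s).real for E in povm(g)])
    return float(contextual_values(g) @ probs)

s = np.array([np.cos(0.3), np.sin(0.3)])  # arbitrary normalized state
exact = float(np.vdot(s, sz @ s).real)    # <s|A s>

for g in (0.5, 0.1, 0.01):
    print(g, meter_expectation(s, g), exact)
```

The price of exactness is visible in `contextual_values`: the values grow like 1/g as the measurement becomes weaker.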

The above discussion sketches a formulation of weak measurement theory 
which gives a more or less direct translation into the language of DAJ and DJ. 
Weak measurement theory is traditionally formulated in terms of a "system" 
Hilbert space S and a "meter" space M with a "meter observable" B. The 
setup of DAJ and DJ replaces the meter space and meter observable by a set of
measurement operators on S. The eigenvalues $\alpha_j(g)$ of the meter observable B
are renamed "contextual values".

Although it seems clear that any statement about the meter observable can
be translated into a statement about measurement operators in S and
conversely, there are significant differences between traditional weak measurement
theory following [4] and the contextual value theory of DAJ and DJ. For
example, unlike contextual value theory, traditional weak measurement theory does
not attempt to obtain (6) for all g, but only in the limit $g \to 0$:

$$\lim_{g \to 0} \sum_j \alpha_j(g) P(j) = \langle s | A s \rangle . \tag{7}$$

DAJ and DJ assume (6) (expressed in terms of measurement operators), i.e.,

$$\sum_j \alpha_j P(j) = \sum_j \alpha_j(g) \langle s | E_j(g) s \rangle = \langle s | A s \rangle , \tag{8}$$

for all small $g > 0$, a very strong assumption. Given $U(g)$ (equivalently, given
the measurement operators), it is often not possible to choose contextual values
$\alpha_j(g)$ satisfying the strong hypothesis (8).

For this and other reasons, the claims of DAJ and DJ that their contextual 
value formalism "subsumes" the traditional weak value formalism seem open to 
question. For example, DJ writes: 

"The formalism is powerful enough to subsume strong measure- 
ments, weak measurements, and any strength of measurement in 
between." 

This is true only because they have added the strong hypothesis (8) that
contextual values can always be chosen for all positive g. If the same hypothesis were
added to traditional weak measurement theory, that theory would also apply to
^The correspondence between projective measurements (identified with observables) in
$S \otimes M$ and measurement operators in S is of course well known (cf. the text [9], section
2.2.8). Indeed, I suspect that it may be what DJ is talking about in the second paragraph of
its section 2. (It is hard to be sure because their symbol Usd is not defined.)
strong measurement and "any strength of measurement in between". In the
other direction, though contextual value theory as formulated in DAJ and DJ
does not apply to cases when contextual values do not exist for all g, it could
probably be reformulated to apply with the weaker hypothesis (7) in place of
(8), though at the expense of additional complication.

One way in which contextual value theory might be argued to be more
general is that the original formulation of weak measurement theory in [4] and
much of the subsequent literature assume a particular form for $U(g)s$, namely

$$U(g)s = \exp(igA \otimes P)(s \otimes m) , \tag{9}$$

where P is a particular operator on a particular meter space M. However, the
formulation of weak measurement theory sketched above does not require this
hypothesis: $U(g)$ does not need to be of the form (9).

An advantage of traditional weak measurement theory using (9) is that it
gives a method to weakly measure any observable. If it were required that (6)
hold for all g, it would not be obvious that a weak measurement procedure would
exist. The same problem arises in the setup of DAJ and DJ. Although it has not
yet been explicitly addressed by the authors (so far as I know), I would expect it to
be easy to solve under their assumption of finite dimensionality.

Traditional measurement theory is formulated in a system+meter space S ® 
M. I view contextual value theory of DAJ and DJ as a simpler formulation 
in system space S alone which is less general as developed in DAJ and DJ 
but could probably be reformulated to become essentially equivalent in finite 
dimensions. I think claims that contextual value theory is more powerful are 
questionable, but it does have the very attractive advantage of simplicity. It 
would be unfortunate if inadequately researched and overstated claims turn out 
to obscure its genuine merit of conceptual simplicity. 

1.3 Postselection 

Most applications of weak measurement theory involve more than mere weak
measurement as described above. Typically, after making a weak measurement
one "postselects" to a given final state $s_f \in S$. This means that one performs a
second projective measurement (with respect to the orthogonal decomposition
$\{P_{s_f}, I - P_{s_f}\}$) to see if after the first measurement, the system is in state $s_f$
("success") or a state orthogonal to $s_f$ ("failure"). (It would take us too far
afield to explain why one might want to do this.) This is done repeatedly
starting with the same initial state s, and only the results of "successful" trials are
retained. The (conditional) expectation of the meter measurement given
successful postselection is called a "weak value" of the system observable A. This
conditional expectation (with "meter measurement" replaced by "measurement

^For simplicity of exposition we restrict attention to postselection to a pure state, as does 
DJ. DAJ considers postselection to a mixed state, but does not explain how this could be 
physically accomplished. 

*The "weak value" is often confused with the conditional expectation of A (instead of the 
meter observable B), and it is important to keep the distinction in mind. The conditional 
using given measurement operators") is DAJ's "general conditioned average",
which is routinely calculated. 

Weak values are not unique; in general they depend on the measurement
procedure. The seminal paper [4] calculated (via questionable mathematics) a
particular weak value for a particular measurement procedure. Since this
calculated weak value is generally nonreal (even though it is supposed to represent
the procedure described above which would result in a weak value which is
manifestly real), most subsequent authors replace it with its real part. The weak
value calculated by [4] is (1.1) of DJ, written in a notation different from that
of DJ (and the present paper) which is common in the "weak value" literature:

$$A_w = \frac{\langle \psi_f | A | \psi_i \rangle}{\langle \psi_f | \psi_i \rangle} . \tag{1.1}$$

Here $\psi_i$ represents the initial state of the system S (called s above), $\psi_f$ the
postselected final state (called $s_f$ above), and $A_w$ stands for "weak value of A".
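DJ's (1.2) is a generalization of the real part of (1.1) to mixed states:

$$A_w = \frac{\mathrm{Tr}(E_f \{A, \rho_i\})}{2\,\mathrm{Tr}(E_f \rho_i)} . \tag{1.2}$$

Here $\rho_i$ is the initial (mixed) state, $E_f$ the postselection, and
$\{A, \rho_i\} = A\rho_i + \rho_i A$.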

We shall refer to either the real part of (1.1) or (1.2) as the "traditional" 
weak value (though I've not seen (1.2) in the literature prior to DAJ). Most of 
the "weak value" literature seems to implicitly assume that the traditional weak 
value is the only possible weak value. 
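
As a small illustration of (1.1), the following sketch evaluates the formula for $A = \sigma_z$ on a qubit. The states are hypothetical choices made only to show two features mentioned in the text: the value is generally nonreal, and even when real it can lie far outside the eigenvalue range $[-1, 1]$ of A when the initial and final states are nearly orthogonal.

```python
import numpy as np

def weak_value(psi_f, A, psi_i):
    # <psi_f|A|psi_i> / <psi_f|psi_i>; np.vdot conjugates its first argument
    return np.vdot(psi_f, A @ psi_i) / np.vdot(psi_f, psi_i)

sz = np.diag([1.0, -1.0])

# Hypothetical complex example: the result is purely imaginary.
phi = 0.5
psi_i = np.array([1.0, 1.0]) / np.sqrt(2)
psi_f = np.array([1.0, np.exp(1j * phi)]) / np.sqrt(2)
w_complex = weak_value(psi_f, sz, psi_i)

# Hypothetical real example with nearly orthogonal initial and final states.
a, b = np.pi / 4, -np.pi / 4 + 0.1
chi_i = np.array([np.cos(a), np.sin(a)])
chi_f = np.array([np.cos(b), np.sin(b)])
w_real = weak_value(chi_f, sz, chi_i)

print("complex example:", w_complex, " real part:", w_complex.real)
print("real example   :", w_real, " (eigenvalues of A are +1 and -1)")
```

For the real example the formula reduces to $\cos(a+b)/\cos(a-b)$, which grows without bound as $a - b$ approaches $\pi/2$.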

In the contextual value approach, it is natural to ask which collections of
measurement operators $\{M_j(g)\}$ will result in the traditional weak value in the
limit $g \to 0$. DAJ claims without adequate proof that this will occur when the
measurement operators are positive. DJ formulates and attempts to prove a 
"General theorem" (GT) with this conclusion. One might roughly summarize 
the GT by the statement that the traditional weak value is essentially inevitable 
when the measurement operators are positive and commute with each other and 
the system observable A. 
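
The following sketch illustrates the kind of convergence at issue in one simple case; it is an illustration, not evidence for the GT in general. It assumes a hypothetical qubit example with $A = \sigma_z$, positive commuting measurement operators $M_\pm(g) = [(I \pm g\sigma_z)/2]^{1/2}$, and contextual values $\pm 1/g$, and it computes the conditioned average as $\sum_j \alpha_j |\langle s_f | M_j s \rangle|^2 / \sum_j |\langle s_f | M_j s \rangle|^2$.

```python
import numpy as np

sz = np.diag([1.0, -1.0])

def measurement_operators(g):
    # Positive square roots of E_+ = (I + g*sz)/2 and E_- = (I - g*sz)/2.
    # They are diagonal, so they commute with each other and with sz.
    return [np.diag(np.sqrt([(1 + g) / 2, (1 - g) / 2])),
            np.diag(np.sqrt([(1 - g) / 2, (1 + g) / 2]))]

def conditioned_average(s, s_f, g):
    alphas = np.array([1.0 / g, -1.0 / g])  # satisfy sum_j alpha_j E_j = sz
    # Joint probability of outcome j followed by successful postselection
    joint = np.array([abs(np.vdot(s_f, M @ s)) ** 2
                      for M in measurement_operators(g)])
    return float(alphas @ joint / joint.sum())

# Hypothetical initial and final states
s = np.array([np.cos(0.3), np.sin(0.3)])
s_f = np.array([np.cos(1.0), np.sin(1.0)])

# Real part of the traditional weak value (1.1)
traditional = (np.vdot(s_f, sz @ s) / np.vdot(s_f, s)).real

for g in (0.5, 0.1, 0.01, 0.001):
    print(g, conditioned_average(s, s_f, g), traditional)
```

In this example the difference between the two printed columns shrinks roughly like $g^2$.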

Counterexamples to the GT are given in [6]. DJ attempts to refute these
counterexamples by reinterpreting (but unfortunately not restating in a logically
precise way) the hypotheses of the GT given in the first preprint version of
DJ, arXiv:1106.1871v1, to which [6] replied. The present work will make the
reinterpretations explicit and give a new counterexample to the proof of the GT
(though not to its conclusion) under its reinterpreted hypotheses.



expectation of A must necessarily be a convex linear combination of the possible values (eigen- 
values) for A, whereas the conditional expectation of B may lie far outside this set, as the 
provocative title of [3] suggests. 
2 Second introduction for those already familiar with the commented paper DJ [1]

The following, from here to the appendix, is essentially the "Comment" paper 
currently under review by J. Phys. A. It is not identical because a few exposi- 
tional improvements have been made, but there are no differences of substance. 

Notation will be the same as in the article under review [1], called DJ below.
To compress this Comment to a traditional length, we must assume that the
reader is already familiar with DJ. The main purpose of DJ seems to be to justify a
statement of [2] (called DAJ below) that a "general conditioned average"
introduced in DAJ "converges uniquely to the quantum weak value in the minimal
disturbance limit". DJ formulates "sufficient conditions" as hypotheses for a 
"General theorem" (GT) with this statement as its conclusion. 

For simplicity, we shall only consider the special case of DAJ and DJ's
"minimal disturbance" condition for which all measurement operators $\{M_j\}$ are
positive. (All statements will also hold for DJ's slightly more general
definition.) The associated positive operator valued measure (POVM) is $\{E_j\}$ with
$E_j := M_j^\dagger M_j$. The measurement operators $M_j = M_j(g)$ depend on a small
"weakness" parameter g which quantifies the degree to which the measurement
affects the system being measured. Our "minimal disturbance limit" will refer
to the so-called "weak limit" $g \to 0$ for positive measurement operators.

DAJ claims that under these assumptions, its "general conditioned average"
(corresponding to what is more usually called a "weak" measurement followed
by a postselection) is given by the traditional "quantum weak value" (the real
part of DJ's (1.1)) in the weak limit $g \to 0$:

"This technique leads to a natural definition of a general conditioned 
average that converges uniquely to the quantum weak value in the 
minimal disturbance limit." 

Counterexamples to this claim were given in [6], examples which DJ attempts
to refute by reinterpreting the hypotheses of its "General theorem" (GT) given
in the first version of DJ, arXiv:1106.1871v1. Unfortunately, DJ does not make
explicit this reinterpretation, but some such reinterpretation is necessary for
their objection to make sense.

^The term "minimally disturbing measurement" for a measurement with positive measure- 
ment operators was used (and perhaps coined) in the recent book [8] of Wiseman and Milburn. 
This reference was unfortunately not cited in DAJ, which uses the term "minimal disturbance 
limit" without definition or intuitive explanation. I've not seen the term used elsewhere in the 
literature outside of DAJ and subsequent papers by its authors. The technical definition of 
the phrase "minimally disturbing measurement" as referring to "a measurement with positive
measurement operators" does not correspond to the meaning which one might assume from
ordinary usage of the words "minimally disturbing". (This is discussed in more detail in [5],
Section 11.)

For simplicity of exposition, our definition of "minimal disturbance limit" will be essentially
that of Wiseman and Milburn: the limit $g \to 0$ for positive measurement operators, even
though DJ uses a slightly more general definition. All statements will also hold for DJ's 
definition. 
The mathematics of these counterexamples is undisputed; the only issue
is whether they satisfy the hypotheses of DAJ or DJ. DJ correctly notes that
the first counterexample using 2x2 measurement matrices does not satisfy what
they call the "pseudoinverse prescription", but DAJ does not clearly state this
prescription as a hypothesis. The second counterexample using 3x3 matrices
does satisfy the pseudoinverse prescription, so the following will deal exclusively
with this counterexample.

Contrary to claims of DJ, this counterexample is valid when the hypotheses
of the "General theorem" (GT) are interpreted as written, according to
standard usage of logical language. However, DJ's attempted refutation of the
counterexamples requires a great strengthening of one of these hypotheses, a
strengthening not noted in DJ. We shall see that when so strengthened, the
hypotheses of the GT seem very close to the assumption that the POVM must
be a linear polynomial in the weak measurement parameter g, i.e.,

$$E_j(g) = E_j^{(0)} + g E_j^{(1)} , \quad \text{where } E_j^{(0)} \text{ and } E_j^{(1)} \text{ are constant operators.} \tag{10}$$

The analysis leading to this conclusion will be straightforward and simple. 
DJ's attempted proof of the GT is densely written, and our analysis of it must 
be correspondingly technical. Although probably few readers will be sufficiently 
familiar with the proof to convince themselves either of its truth or of the claim 
that there is a major error, I hope that the analysis may motivate anyone 
tempted to employ (or cite without comment) the "General theorem" to first 
carefully scrutinize its proof. 



3 Unstated hypotheses for the "General theo- 
rem" 

The hypotheses of the "General theorem" (GT) which will concern us are:

"(iii) The equality $A = \sum_j \alpha_j(g) E_j(g)$ must be satisfied, where the
contextual values $\alpha_j(g)$ are selected according to the
pseudoinverse prescription.

(iv) The minimum nonzero order in g for all $E_j(g)$ is $g^n$ such that
(iii) is satisfied."

"Minimum nonzero order" is not a standard mathematical phrase, but I take
its occurrence in (iv) to mean that

$$E_j(g) = E_j^{(0)} + \sum_{k=n}^{\infty} g^k E_j^{(k)} \tag{11}$$



^"However, DJ does correctly note a typo in the definition of one of the measurement 
operators in [6]; a -^1/3 had been mistakenly written as 1/3. However the correct value was 
used in the subsequent calculations, so apart from this single substitution, no other alterations 
in the argument of [6] are necessary. I thank the authors for this helpful correction. 

^^This is discussed in more detail in [6].
with the $E_j^{(k)}$ constant operators and $E_j^{(n)} \neq 0$. This is the way the phrase is
used (just after DJ's equation (5.2)) in the attempted proof of the GT. Then
the logical content of (iv) is that all $E_j$ have the same minimum nonzero order,
which is to be denoted n. This is a strange and quite restrictive assumption
for a "General theorem", but it will not be our main concern.

When the hypotheses of the GT are given the standard logical interpretation
just described, the counterexample using 3x3 matrices which DJ attempts to
refute is indeed a counterexample to the GT. However, DJ's attempt to refute
the counterexample appears to assume something like the following.

Denote by $E'_j(g)$ the truncation of the series (11) to order n, namely,

$$E'_j(g) := E_j^{(0)} + g^n E_j^{(n)} . \tag{12}$$

Then (iv) assumes (iii) with the $E_j(g)$ in (iii) replaced by $E'_j(g)$, but
with the contextual values $\alpha_j(g)$ unchanged (i.e., the contextual values
for the truncated POVM $\{E'_j(g)\}$ are the same as for the original POVM
$\{E_j(g)\}$). More explicitly, it assumes that

$$\sum_j \alpha_j(g) E'_j(g) = A , \tag{13}$$

where the $\alpha_j(g)$ satisfy the pseudoinverse prescription for the truncated
POVM.

DJ's objection to the counterexample, given after its equation (7.4), is that
it does not satisfy (13). DJ does not explicitly say that the contextual values for
the truncated POVM are the same as for the original POVM, but that seems
suggested by the fact that it uses the same symbols $\alpha_j(g)$ for both. Also, the
details of DJ's attempted proof support that interpretation.

Next recall that (iii) assumes that the contextual values
$\vec{\alpha}(g) = (\alpha_1(g), \ldots, \alpha_N(g))$ satisfy the "pseudoinverse prescription"

$$\vec{\alpha} = F^+ \vec{a} , \tag{14}$$

where $\vec{a}$ is a list of eigenvalues for the system observable A, and $F^+$ is the
Moore-Penrose pseudoinverse for the matrix

$$F = F(g) := [\vec{E}_1(g), \ldots, \vec{E}_N(g)] . \tag{15}$$

Here the column vector $\vec{E}_j(g)$ is the list of eigenvalues for $E_j(g)$, and F is the
matrix composed of those columns. Note that F is g-dependent, but we write

^^The restrictive phrase "such that (iii) is satisfied" is logically redundant, since (iii) has 
already been assumed. If the authors mean that some alteration of (iii) is to be assumed, 
such as (iii) with the $E_j$ replaced by their truncations to order n or (iii) with the original
contextual values previously denoted $\alpha_j(g)$ replaced by others or some combination of these,
then standard logical language requires that this be explicitly stated. I have considered several 
alternative interpretations of (iv), but all have led to inconsistencies with other parts of DJ. 
In the absence of requested clarification from the authors, I selected the one which seems most 
nearly consistent with the rest of DJ. 






$F = F(g)$ only when necessary to emphasize this point, to avoid possible
confusion with the result of applying the matrix to a vector. If contextual values
exist (in general, they don't), they are uniquely determined by the
"pseudoinverse prescription" (14).
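
As a concrete sketch of the prescription (14), consider a hypothetical three-outcome POVM on a qubit (not one of the examples of DJ or [6]): $E_j(g) = (I + g c_j \sigma_z)/3$ with $c = (1, 0, -1)$ and $A = \sigma_z$. Here F is a 2x3 matrix, so the equation $F\vec{\alpha} = \vec{a}$ has many solutions, and the Moore-Penrose pseudoinverse selects the one of minimum Euclidean norm. The code uses NumPy's `np.linalg.pinv`.

```python
import numpy as np

def F_matrix(g):
    # Hypothetical POVM E_j(g) = (I + g*c_j*sz)/3 with c = (1, 0, -1).
    # Column j lists the eigenvalues of E_j(g), as in (15).
    c = np.array([1.0, 0.0, -1.0])
    return np.vstack([(1 + g * c) / 3,    # eigenvalues on the sz = +1 eigenvector
                      (1 - g * c) / 3])   # eigenvalues on the sz = -1 eigenvector

a = np.array([1.0, -1.0])  # eigenvalues of A = sz

def contextual_values(g):
    # Pseudoinverse prescription (14): alpha = F^+ a
    return np.linalg.pinv(F_matrix(g)) @ a

for g in (0.5, 0.1, 0.01):
    alpha = contextual_values(g)
    # Residual F alpha - a shows that the contextual values reproduce A
    print(g, alpha, F_matrix(g) @ alpha - a)
```

For this example the result is $(3/(2g), 0, -3/(2g))$, which again grows like 1/g.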

Since the contextual values for the truncated POVM $\{E'_j(g)\}$ are assumed
the same as those for the original and to also satisfy the pseudoinverse
prescription for the truncated POVM, we also have

$$\vec{\alpha} = (F')^+ \vec{a} \tag{16}$$

with

$$F' := [\vec{E}'_1(g), \ldots, \vec{E}'_N(g)] , \tag{17}$$

where the $\vec{E}'_j(g)$ are the column vectors of eigenvalues for $E'_j(g)$. Equation (16)
also uniquely determines the contextual values $\alpha_j(g)$, so it would be surprising
if both (14) and (16) would hold except in the trivial case in which $E'_j = E_j$
for all j. In that case, we can make $\{E_j\}$ linear (i.e., of form (10)) by replacing
the parameter g by a new parameter $h := g^n$, so for brevity we shall refer to
this as the "linear case". The hypothesis that both do hold seems very close to
a hypothesis that the original POVM be linear. Indeed, I do not know of any
example of a nonlinear POVM for which both (14) and (16) can hold.

4 Error or gap in proof 

Readers thinking of building on the work of DAJ and DJ may need to convince 
themselves of the validity of its "General theorem" . Since its attempted proof 
is densely written, it may help to pinpoint what I think is a critical error (or at 
least a serious gap), even under the strong hypothesis that the POVM is linear. 

This hypothesis is equivalent to the assumption that the matrix $F = F(g)$
determining the contextual values $\vec{\alpha}$ is first order in g, in which case the
minimum nonzero order of F which the proof calls n is n = 1. To expose the
gap, we use these assumptions to rewrite the questionable part of the proof
in a simplified form. It applies to a matrix F with singular value
decomposition $F = U \Sigma V^T$, where $\Sigma$ is a diagonal matrix and "U and V are orthogonal
matrices". All of these matrices depend on the weak limit parameter g.

The contextual values $\vec{\alpha}$ (which the proof renames $\vec{\alpha}_0$) are determined by the
pseudoinverse prescription $\vec{\alpha} = \vec{\alpha}_0 = F^+ \vec{a}$, where $\vec{a}$ is the vector of eigenvalues of
A. Here $F^+$ is the Moore-Penrose pseudoinverse of F, given by $F^+ = V \Sigma^+ U^T$,
where $\Sigma^+$ is the diagonal matrix obtained from $\Sigma$ by inverting all its nonzero
elements.
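
The formula $F^+ = V \Sigma^+ U^T$ can be checked numerically. The sketch below is illustrative only: it uses a hypothetical rank-one matrix, chosen so that one singular value is zero and must be left uninverted, and it treats singular values below a small relative tolerance as zero (a numerical assumption, since rounding rarely produces an exact zero).

```python
import numpy as np

def pinv_via_svd(F, rel_tol=1e-12):
    # Thin SVD: F = U @ diag(sigma) @ Vt, with sigma in descending order
    U, sigma, Vt = np.linalg.svd(F, full_matrices=False)
    # Sigma^+: invert only the singular values that are nonzero (up to rounding)
    sigma_plus = np.zeros_like(sigma)
    nonzero = sigma > rel_tol * sigma.max()
    sigma_plus[nonzero] = 1.0 / sigma[nonzero]
    return Vt.T @ np.diag(sigma_plus) @ U.T

F = np.array([[1.0, 2.0], [2.0, 4.0]])  # hypothetical rank-one example
a = np.array([1.0, 2.0])                # lies in the range of F

F_plus = pinv_via_svd(F)
alpha = F_plus @ a

print(F_plus)           # agrees with np.linalg.pinv(F, rcond=1e-12)
print(alpha, F @ alpha)
```

Because only the nonzero singular values are inverted, any poles of $F^+(g)$ as $g \to 0$ can come only from singular values that tend to zero without vanishing identically.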

In reading the following, please keep in mind that if correct, it should apply
to any matrix function $F = F(g)$. Although an $F = F(g)$ derived from a POVM
has a special form given in part by DJ's preceding equation (5.9), nothing in
the following proof fragment uses this special form.

The proof mentions "relevant" singular values, but for brevity I have omit- 
ted the definition of "relevant" (which does not involve the special form of F) 
because for the simple counterexample to be given, all we have to know is that
a "relevant" singular value is a particular kind of singular value, as the syntax 
implies. The simplified proof fragment is: 

Since the orthogonal matrices U and V have nonzero orthogonal limits
$\lim_{g \to 0} U = U_0$ and $\lim_{g \to 0} V = V_0$, such that $U_0^T U_0 = V_0 V_0^T = 1$, and
since $\vec{a}$ is g-independent, then the only poles in the solution
$\vec{\alpha}_0 = F^+ \vec{a} = V \Sigma^+ U^T \vec{a}$ must come from the inverses of the relevant
singular values in $\Sigma^+$.

Therefore, to have a pole of order higher than $1/g$, there must be at least
one relevant singular value with a leading order greater than $g^1$.

[This much seems all right, though many details are omitted, but I cannot
follow the next and last paragraph of DJ's attempted proof.]

However, if that were the case then the expansion of F to order $g^1$ would
have a relevant singular value of zero and therefore could not satisfy (5.12),
contradicting the assumption (iv) about the minimum nonzero order of
the POVM. Therefore, the pseudoinverse solution $\vec{\alpha}_0 = F^+ \vec{a}$ can have no
pole with order higher than $O(1/g)$ and the theorem is proved.

If correct, the above proof fragment would imply that if a linear matrix function
$F(g) = P + gQ$, with P and Q constant matrices, has a singular value with a
leading order greater than $g^1$, then it also has a singular value which is identically
zero. (Put differently, the proof claims that if no singular value is identically
zero for all g, then no singular value has leading order greater than $g^1$.)
However, it is easy to construct counterexamples such as

$$F := \begin{bmatrix} 1+g & 1 \\ -1 & -1+g \end{bmatrix} , \tag{18}$$

which has singular values $[g^2 + 2 - 2\sqrt{g^2+1}]^{1/2} = g^2/2 + O(g^4)$ and
$[g^2 + 2 + 2\sqrt{g^2+1}]^{1/2} = 2 + g^2/2 + O(g^4)$.
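
The singular values quoted for (18) can be checked numerically with a short sketch. It also confirms that $\det F(g) = g^2$ and that $F(g)^{-1}$ applied to a fixed vector grows like $g^{-2}$; the vector $\vec{a} = (1, 0)$ is a hypothetical choice made only for illustration.

```python
import numpy as np

def F(g):
    # The linear matrix function of (18)
    return np.array([[1.0 + g, 1.0], [-1.0, -1.0 + g]])

a = np.array([1.0, 0.0])  # hypothetical fixed vector, for illustration only

for g in (1e-1, 1e-2, 1e-3):
    big, small = np.linalg.svd(F(g), compute_uv=False)  # descending order
    det = np.linalg.det(F(g))
    alpha = np.linalg.solve(F(g), a)                    # F(g)^{-1} a
    print(f"g={g:g}  small={small:.6e}  g^2/2={g**2 / 2:.6e}  "
          f"big={big:.8f}  2+g^2/2={2 + g**2 / 2:.8f}  "
          f"det={det:.3e}  g^2*|alpha|={g**2 * np.linalg.norm(alpha):.6f}")
```

The last column stays near $\sqrt{2}$ as g decreases, which is the $g^{-2}$ growth of the solution.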

Without performing the somewhat messy calculation of the singular values,
one can see directly from Cramer's rule that since $\det F(g) = g^2$, $F(g)^{-1} \sim g^{-2}$,
which would make the contextual values $\vec{\alpha} = F^{-1} \vec{a}$ asymptotic to $g^{-2}$. The
essence of the full proof of the GT is to show (continuing to assume n = 1
for simplicity) that the contextual values are $O(1/g)$, which implies that the
"numerator correction" of DJ's (5.7) vanishes in the limit $g \to 0$.

Let us try to follow in detail the last paragraph of the proof in the context of
the counterexample. Applied to the F of (18), the last paragraph asserts that
if F has a singular value of order greater than $g^1$ (which it does), then "the
expansion of F to order $g^1$ would have a relevant singular value of zero . . .".
However, this is wrong because the expansion of F to order $g^1$ is F itself, and
all singular values are positive for $g \neq 0$.

I suspect that the last paragraph of the attempted proof may be based on an
erroneous implicit assumption that truncating a $\Sigma$ corresponding to $F(g)$ will
produce the $\Sigma$ for the truncated F, i.e., that truncation commutes with taking
of singular values. Otherwise, how could one possibly relate the $\Sigma$ corresponding
to F in (5.12) to the different $\Sigma$ corresponding to the linear truncation of F?
(Even assuming this relation, additional argument seems required to justify the
last paragraph of the proof fragment.)

To make the above more explicit, write $\Sigma = \Sigma(F)$ to indicate the dependence
of the matrix $\Sigma$ of singular values on F, and write $\tau(F)$ for the linear truncation
of F. I can begin to make sense of the last paragraph only by assuming that

$$\Sigma(\tau(F)) = \tau(\Sigma(F)) ,$$

which says that the singular values for the truncated F are the truncations of
the singular values for F. The counterexample shows that this is false for its F
which satisfies $\tau(F) = F$:

$$\Sigma(\tau(F)) = \Sigma(F) = \begin{bmatrix} 2 + g^2/2 + O(g^4) & 0 \\ 0 & g^2/2 + O(g^4) \end{bmatrix} \neq \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} = \tau(\Sigma(F)) .$$
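
The failure of truncation to commute with taking singular values can also be seen numerically for the F of (18). In the sketch below, the first-order coefficients of the singular values are estimated by a finite difference (a numerical assumption; the step g is an arbitrary small value), which gives the linear truncation of $\Sigma(F)$ as approximately diag(2, 0), while F itself, being already linear in g, keeps a strictly positive small singular value.

```python
import numpy as np

P = np.array([[1.0, 1.0], [-1.0, -1.0]])  # constant term of F in (18)
Q = np.eye(2)                              # coefficient of g in (18)

def F(g):
    return P + g * Q                       # F is linear in g, so tau(F) = F

def singular_values(M):
    return np.linalg.svd(M, compute_uv=False)  # descending order

g = 1e-3
sigma0 = singular_values(P)                      # zeroth-order terms: (2, 0)
slope = (singular_values(F(g)) - sigma0) / g     # finite-difference estimate of
                                                 # the first-order coefficients
tau_of_sigma = sigma0 + g * np.round(slope, 2)   # linear truncation of Sigma(F)
sigma_of_tau = singular_values(F(g))             # Sigma(tau(F)) = Sigma(F)

print("tau(Sigma(F)) ~", tau_of_sigma)   # approximately (2, 0)
print("Sigma(tau(F)) =", sigma_of_tau)   # small entry is about g^2/2 > 0
```

The estimated slopes are about g/2 rather than exactly zero because the finite difference picks up the quadratic term; they shrink proportionally as g is reduced.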



Recall that (13) was my best guess at the intended expansion of DJ's
hypothesis (iv) from its logical meaning. (A direct request to the authors to confirm or
correct this was ignored.) My next best guess would be that the $\alpha_j(g)$ in (13)
might represent contextual values for the truncated POVM $\{E'_j\}$ that would
not necessarily be contextual values for the original POVM $\{E_j\}$. However, the
above objection to the proof would still apply.

Whatever the intended meaning of DJ's (iv), in view of DJ's objection to the 
counterexample, it presumably imposes some condition on the linear truncation 
(still taking n = 1 for simplicity) of the original POVM $\{E_j(g)\}$. This seems
an unreasonable hypothesis for a theorem billed as "General". Certainly the 
original claim of DAJ that its "general conditioned average" "converges uniquely 
to the quantum weak value in the minimal disturbance limit" gives no hint that 
unstated hypotheses necessary to validate the claim would fail to apply to simple 
cases such as the counterexample with POVM which is quadratic in g. 

DAJ gives the strong impression that the traditional weak value is essentially 
inevitable when the measurement operators are positive. DJ gives the same 
impression under the additional hypothesis that the measurement operators 
commute with each other and the system observable A. A main point of both 

and the present Comment is to dispel any such false impressions. 

It should be emphasized that (18) is only a counterexample to DJ's attempted
proof, not a counterexample to the conclusion of the GT under the
assumption that the POVM is linear, i.e., of the form ([TU]). For a counterexam- 
ple to the conclusion, one would need an F which is derived from a POVM. 

Actually, I conjecture that the conclusion that the "general conditioned aver- 
age" is given by DJ's (1.2) (i.e., the traditional weak value generalized to mixed 
states) is true for linear POVM's under the pseudoinverse prescription. If so, 
its proof will surely have to use in some essential way the special form of an F 
which comes from a POVM (e.g., all rows sum to 1). 

I have sketched such a proof but have not written it in detail, so I make
no claims. I will be happy to share the ideas of the proof with any qualified
person who might be interested in expanding on them. They are not difficult,
but annoyingly detailed. If I decide not to write them up in journal-ready detail
myself, I may put a sketch of a proof on my website, www.math.umb.edu/~sp.

A main aim of this Comment is to focus attention on the case of lin- 
ear POVM's. If the conjecture is true, it might help to explain why (to my 
knowledge) actual experiments have only observed the traditional weak value 
despite the fact that arbitrary weak values can be ob- 
tained from different measurement procedures, as stressed by Dj[3 Since these 
experiments are difficult and have only recently been performed, perhaps they 
correspond to the simplest POVM's, e.g., linear POVM's arising from positive 
measurement operators. 

5 Appendix 1: Guesses at the meaning of hy- 
pothesis (iv) 

When I saw the grounds on which DJ disputed the counterexample of [5], I 
was stunned. Never had I even considered the possibility that hypothesis (iv) 
might refer to truncations, and had I considered it, I would have rejected it 
as implausible. I still find it hard to imagine that any careful reader could 
confidently assert that (iv) referred to truncations, much less be confident of 
any definite meaning regarding truncations. 

Since DJ was accepted and I began to prepare this Comment, I have thought
a great deal about possible interpretations of (iv). The authors' intended meaning
is still not clear to me. All interpretations which I have considered are either
logically unacceptable or inconsistent with some part of DJ.

This appendix analyzes the interpretations which I have considered. I have 
debated whether it would be worth while to include it, since I imagine that few 
readers will be interested in investing their time in a detailed logical deconstruc- 
tion of (iv). I decided to include it for three reasons. 

First, it may serve to alert some readers to logical problems with the statement
of the GT even if they choose not to study them in detail. Second, I hope
that it may motivate the authors of DJ to state (iv) precisely in correct logical
language in any reply to the Comment, so that readers can make informed decisions
based on a definite knowledge of what DJ intended to assume. Third,
since apparently no referees' report has yet been received over two months after
submission, I hope it may assist the referee. I assume that the referee who
recommended acceptance of DJ without clarification of (iv) had not thought
carefully about its logical meaning.

Recall the hypotheses (iii) and (iv) of DJ's "General theorem" (GT) both
as originally posted in arXiv:1106.1871v1 and subsequently in DJ:

^''See [7] or the list of references [10] of DJ. 

^■'The referee has my sincere sympathy. If he is conscientious enough to try to actually 
determine the correctness of DJ's densely written proof based on unclearly stated hypotheses, 
it will take far more time than is reasonable to ask of an unpaid volunteer.

"(iii) The equality $A = \sum_j a_j(g) E_j(g)$ must be satisfied, where the
contextual values $a_j(g)$ are selected according to the pseudoinverse
prescription.

(iv) The minimum nonzero order in g for all $E_j(g)$ is $g^n$ such that
(iii) is satisfied."

The statement of hypothesis (iv) is very peculiar, certainly not correct logical 
language. To analyze it, we need to review a few elementary principles of logic. 
What is the meaning of the statement: 

"x = 2" ? 

I surely hope that the reader mentally replied that it is meaningless in isolation. 
It is a so-called "open sentence" to which something must be added to give it 
meaning, i.e., to convert it into a logical statement which is either true or false. 

It can be given meaning by defining x before stating "x = 2". For example,
if x were previously defined as:

Let x denote the largest positive integer which satisfies the equation

-x^ -24 = , 

then "x=2" would be a logically meaningful statement which would be definitely 
true or false (though we might not know which). 
To phrase the definition of x just above as 

x is the largest positive integer which satisfies the equation

x'^ -x^ -24 = 

would not be correct logical language for a definition. Though some readers 
might be able to guess that it was intended as a definition of the symbol x, it 
is so far from accepted logical language that any logically trained person would 
have to question what was meant. It could not be justified as a logical shorthand 
because the previous correctly stated definition is no more complicated. This is 
analogous to (iv) with an inessential change of word order: $g^n$ is the minimum
nonzero order in g . . . .

In (iv), the symbol n has not been previously defined, so if (iv) is not to be 
treated as meaningless, the best guess at the authors' meaning is probably that 
(iv) is intended as a definition of n. But (iv) is supposed to be a hypothesis (i.e., 
a logical statement assumed true), not a definition. 

Let us put (iv) aside for the moment to examine another logical principle. We 
noted that in isolation, "x = 2" is meaningless as a logical statement (because 
there is no way, even in principle, to assign it a truth value). One way to make it 
meaningful is to predefine x. Another is to prepend one of the so-called logical
"quantifiers" ∀ ("for all" or "for every") and ∃ ("there exists"), e.g.,

∀ integers x, x = 2 (a meaningful statement which happens to be
false)

or

∃ an integer x such that x = 2 (a meaningful though somewhat silly
statement which happens to be true).

In the second statement, the phrase "such that" is logically unnecessary (and 
customarily omitted in formal logic), but is added to make the sentence read 
well in English. Also, a necessary specification of the so-called "universe of 
discourse" (in this case that we are talking about integers) has been added to 
both statements. (If the universe of discourse had been previously specified, 
this would be unnecessary.) 
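The distinction between the open sentence and its two quantified forms can be illustrated in a few lines of Python. This is only a sketch: the finite range `universe` is an assumption made so that the quantifiers can be checked by enumeration, whereas the statements above quantify over all integers, and the name `open_sentence` is introduced here purely for illustration.

```python
# Finite stand-in for the universe of discourse (an assumption made for
# illustration; the text quantifies over all integers).
universe = range(-5, 6)

def open_sentence(x):
    """The open sentence "x = 2": it has a truth value only once x is supplied."""
    return x == 2

for_all = all(open_sentence(x) for x in universe)    # "for all x, x = 2"
exists = any(open_sentence(x) for x in universe)     # "there exists x, x = 2"

print(for_all, exists)  # prints: False True
```

Note that `open_sentence` by itself is a function awaiting an argument, not a truth value; only the quantified expressions evaluate to true or false.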

Having made these points explicit, let us return to (iv). Suppose we tem- 
porarily ignore the last clause in (iv) and for purposes of examination write the 
remainder by itself: 

(iv) The minimum nonzero order in g for all $E_j(g)$ is $g^n$ such that

This is a strange wording which no logician would use, so I hate to analyze 
further without changing the word order to obtain more nearly correct logical 
language which (so far as I can guess at the authors' intention) carries the same 
logical meaning: 

(iv) For all $E_j(g)$, the minimum nonzero order in g is $g^n$ such that

For this to begin to make sense, we would have to know what is meant by
"minimum nonzero order", which is not a standard mathematical phrase. From
the way it is used in the proof of DJ's "General theorem" (GT) just after
(5.2), I think that the only reasonable guess is that it means that

$$E_j(g) = E_j^{(0)} + g^n E_j^{(n)} + O(g^{n+1}) \tag{19}$$

with the $E_j^{(k)}$ constant operators and $E_j^{(n)} \neq 0$. The truncated statement (iv)
just above is still not quite meaningful because n remains undefined, but no
matter what integer n stands for, the statement does imply that all the $E_j(g)$
have the same minimum nonzero order $g^n$. If we take this common minimum
nonzero order as the definition of n, then the statement becomes meaningful.
Though still strangely worded, it could reasonably be interpreted as saying that
all the $E_j(g)$ have the same minimum nonzero order, which is to be denoted n.
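This reading of "minimum nonzero order" can be made concrete with a short sketch. It assumes, purely to keep the example small, that scalar coefficients stand in for the constant operators in the power series of each POVM element; the function names and the sample expansions are hypothetical and are not taken from DJ or from the counterexample.

```python
def min_nonzero_order(coeffs, tol=0.0):
    """Smallest k >= 1 whose coefficient of g**k is nonzero, or None.

    coeffs[k] is the coefficient of g**k; scalars stand in for the constant
    operators of the expansion (an assumption made to keep the sketch short).
    """
    for k, c in enumerate(coeffs):
        if k >= 1 and abs(c) > tol:
            return k
    return None

def common_order(povm_coeffs):
    """Return n if every element has the same minimum nonzero order, else None."""
    orders = {min_nonzero_order(c) for c in povm_coeffs}
    return orders.pop() if len(orders) == 1 else None

# Hypothetical expansions, not taken from DJ or from the counterexample.
same = [[0.5, 0.0, 0.3], [0.5, 0.0, -0.3]]     # both start at g**2
mixed = [[0.5, 0.1, 0.0], [0.5, 0.0, -0.1]]    # orders 1 and 2
print(common_order(same), common_order(mixed))  # prints: 2 None
```

Under this reading, n is well defined only when `common_order` returns an integer, i.e., when all elements share the same minimum nonzero order.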

But what of the restrictive clause beginning "such that" which we sup- 
pressed? This clause is 

"... such that (iii) is satisfied". 

But we have already assumed that (iii) is satisfied, so the restrictive clause 
implies no restriction at all. 



A request to the authors to confirm or correct this was ignored.

At this point, an experienced reader will become uneasy, and indeed I did.
I asked myself why the authors would add a restrictive clause which was no 
restriction at all. Could the authors have some other meaning in mind, but have 
expressed it in a logically incorrect way? In trying to guess other meanings, I 
came up with only one plausible possibility, but it turned out to be inconsistent 
with something else in DJ. Next we will examine this possibility. If a referee did 
not think very carefully about possible meanings for (iv), it might superficially 
seem a reasonable possibility. 

One sometimes sees statements in the physics and mathematics literature 
similar to: 

= 1 + X + 12 to order . 

Could (iv) carry a similar meaning? 

Well, (iii) is already assumed to hold exactly, so it holds to all orders in g,
so if we interpret (iv) as defining n as the smallest nonzero order to which (iii)
holds, then (iv) would define n := 1 no matter what the POVM $\{E_j(g)\}$ was.
It doesn't seem as if that would be the authors' intention. Otherwise, why not
simply define n := 1?

Now we enter the realm of real guesswork. Could (iv) be intended to mean
the following, or something like it? 

There exists a positive integer n such that (iii) holds with each $E_j(g)$
replaced by its truncation to order n, i.e., if

$$E_j(g) = \sum_{k=0}^{\infty} E_j^{(k)} g^k$$

and we define $\hat{E}_j(g)$ by

$$\hat{E}_j(g) := \sum_{k=0}^{n} E_j^{(k)} g^k ,$$

then (iii) holds with the original $E_j(g)$ replaced by their truncations:

$$\sum_j a_j \hat{E}_j(g) = A , \qquad \text{(iii)}'$$

and moreover, n is defined to be the least positive integer for which equation
(iii)' holds.

This is still not logically definite because we have to guess if the $a_j$ are the same
as already defined by the original (iii), or are defined by the pseudoinverse
prescription (which is part of (iii)) applied to the new, truncated POVM $\{\hat{E}_j(g)\}$,
or both.

By now the reader's head is probably spinning at the multiplicity of conceivable
interpretations, but mercifully, we do not have to consider all of them in
detail. That is because for the counterexample, there are no $a_j(g)$ which satisfy
(iii)' for n = 1, as DJ shows.

Therefore, the least n for which (iii)' could hold is n = 2, and it does 
hold for n = 2 because the counterexample is quadratic in g and satisfies (iii). 
Therefore, according to the interpretation of (iv) being considered, n = 2 for 
the counterexample. 

But DJ's objection to the counterexample requires that n = 1. The problem,
I suspect, is that DJ may be simultaneously using two inconsistent definitions
for n, the original definition of (19) and the different definition introduced just
above. DJ's objection to the counterexample is valid only if n = 1, but the
objection also requires some interpretation of (iv) in terms of truncations. If
(iv) is interpreted in terms of truncations as above, then n = 2.

Of course, I cannot rule out the possibility that DJ may be using some 
wild interpretation of (iv) which I haven't even considered. But in terms of 
the above, DJ's objection to the counterexample is invalid. Having published 
"Sufficient conditions . . .", unless the authors withdraw their objection to the 
counterexample, they have a professional obligation to furnish an unexceptionable
statement of (iv), one which is clear and logically correct. Only then will
readers have the tools necessary to evaluate the counterexample and DJ's
objection to it.

All this would become moot if the authors recognize that DJ's attempted 
proof of the GT is in error or incomplete. But if they come up with a revised 
proof, they should give first priority to restating (iv) in a clear and logically 
correct way.